Randomized algorithms for distributed computation of principal component analysis and singular value decomposition
Authors
Abstract
As illustrated via numerical experiments with an implementation in Spark (the popular platform for distributed computation), randomized algorithms provide solutions to two ubiquitous problems: (1) the distributed calculation of a full principal component analysis or singular value decomposition of a highly rectangular matrix, and (2) the distributed calculation of a low-rank approximation (in the form of a singular value decomposition) to an arbitrary matrix. Carefully honed algorithms yield results that are uniformly superior to those of the stock, deterministic implementations in Spark; for instance, whereas the stock software will without warning return left singular vectors that are far from numerically orthonormal, a significantly burnished randomized implementation generates left singular vectors that are numerically orthonormal to nearly the machine precision.
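To make the second problem in the abstract concrete, here is a minimal single-machine sketch of a randomized low-rank SVD in NumPy. It is not the Spark implementation the abstract refers to; it follows the generic random-projection-plus-QR recipe. The function name `randomized_svd` and the parameters `oversample`, `n_iter`, and `seed` are assumptions chosen for illustration.

```python
import numpy as np


def randomized_svd(a, rank, oversample=10, n_iter=2, seed=0):
    """Approximate the leading `rank` singular triplets of `a`.

    Illustrative sketch only: names and defaults are assumptions,
    not taken from the paper or from Spark.
    """
    rng = np.random.default_rng(seed)
    m, n = a.shape
    # Sample a few extra directions beyond the target rank.
    k = min(rank + oversample, m, n)

    # Project onto a random subspace and orthonormalize the result.
    omega = rng.standard_normal((n, k))
    q, _ = np.linalg.qr(a @ omega)

    # Power iterations, re-orthonormalizing after every product
    # so that rounding errors do not swamp the small singular values.
    for _ in range(n_iter):
        q, _ = np.linalg.qr(a.T @ q)
        q, _ = np.linalg.qr(a @ q)

    # SVD of the small k-by-n matrix, then lift back to m dimensions.
    b = q.T @ a
    u_small, s, vt = np.linalg.svd(b, full_matrices=False)
    u = q @ u_small
    return u[:, :rank], s[:rank], vt[:rank]


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # Tall, exactly rank-5 test matrix.
    a = rng.standard_normal((2000, 5)) @ rng.standard_normal((5, 50))
    u, s, vt = randomized_svd(a, rank=5)
    # Deviation of the left singular vectors from orthonormality.
    print(np.linalg.norm(u.T @ u - np.eye(5)))
```

Because the left singular vectors are formed as the product of two matrices with orthonormal columns, their orthonormality can be checked directly, which is the property the abstract highlights.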
Similar resources
Randomized Matrix Decompositions using R
The singular value decomposition (SVD) is among the most ubiquitous matrix factorizations. Specifically, it is a cornerstone algorithm for data analysis, dimensionality reduction and data compression. However, despite modern computer power, massive datasets pose a computational challenge for traditional SVD algorithms. We present the R package rsvd, which enables the fast computation of the SVD...
An implementation of a randomized algorithm for principal component analysis
Recent years have witnessed intense development of randomized methods for low-rank approximation. These methods target principal component analysis (PCA) and the calculation of truncated singular value decompositions (SVD). The present paper presents an essentially black-box, fool-proof implementation for Mathworks’ MATLAB, a popular software platform for numerical computation. As illustrated v...
High-Performance Out-of-core Block Randomized Singular Value Decomposition on GPU
Fast computation of singular value decomposition (SVD) is of great interest in various machine learning tasks. Recently, SVD methods based on randomized linear algebra have shown significant speedup in this regime. This paper attempts to further accelerate the computation by harnessing a modern computing architecture, namely graphics processing unit (GPU), with the goal of processing large-scal...
Lossy Color Image Compression Based on Singular Value Decomposition and GNU GZIP
In matrix algebra, the singular value decomposition (SVD) is a factorization of a complex matrix that has been applied to principal component analysis, canonical correlation in statistics, and the determination of low-rank approximations of matrices. In this paper, using the SVD and the theory of low-rank approximation of a matrix, we offer a new scheme for color image compression based on singul...
Exploratory factor and principal component analyses: some new aspects
Exploratory Factor Analysis (EFA) and Principal Component Analysis (PCA) are popular techniques for simplifying presentation of, and investigating structure of, an (n×p) data matrix. However, these fundamentally different techniques are frequently confused, and the differences between them are obscured, because they give similar results in some practical cases. We therefore investigate conditio...
Journal title:
- CoRR
Volume abs/1612.08709
Pages -
Publication date 2016